The reader should be familiar with basic linear algebra concepts.
The reader should be able to understand and check the facts about eigenvalue decomposition.
There are many excellent books on the subject. Here we list a few:
J. W. Demmel, Applied Numerical Linear Algebra
G. H. Golub and C. F. Van Loan, Matrix Computations
N. Higham, Accuracy and Stability of Numerical Algorithms
L. Hogben, ed., Handbook of Linear Algebra
B. N. Parlett, The Symmetric Eigenvalue Problem
G. W. Stewart, Matrix Algorithms, Vol. II: Eigensystems
For more details and the proofs of the Facts below, see L. M. DeAlba, Determinants and Eigenvalues and the references therein.
We state the basic definitions:
Let $\mathbb{F}=\mathbb{R}$ or $\mathbb{F}=\mathbb{C}$ and let $A\in \mathbb{F}^{n\times n}$ with elements $a_{ij}\in \mathbb{F}$.
An element $\lambda \in \mathbb{F}$ is an eigenvalue of $A$ if $\exists x\in \mathbb{F}^n$, $x\neq 0$, such that
$$Ax=\lambda x,$$
and $x$ is an eigenvector of $A$ corresponding to $\lambda$.
The characteristic polynomial of $A$ is $p_A(x)=\det(A-xI)$.
The algebraic multiplicity, $\alpha(\lambda)$, is the multiplicity of $\lambda$ as a root of $p_A(x)$.
The spectrum of $A$, $\sigma(A)$, is the multiset of all eigenvalues of $A$, with each eigenvalue appearing $\alpha(\lambda)$ times.
The spectral radius of $A$ is
$$\rho(A)=\max\{|\lambda| : \lambda \in \sigma(A)\}.$$
The eigenspace of $\lambda$ is
$$E_{\lambda}(A)=\ker(A-\lambda I).$$
The geometric multiplicity of $\lambda$ is
$$\gamma(\lambda)=\dim(E_{\lambda}(A)).$$
$\lambda$ is simple if $\alpha(\lambda)=1$.
$\lambda$ is semisimple if $\alpha(\lambda)=\gamma(\lambda)$.
$A$ is nonderogatory if $\gamma(\lambda)=1$ for all $\lambda$.
$A$ is nondefective if every $\lambda$ is semisimple.
$A$ is diagonalizable if there exist a nonsingular matrix $B$ and a diagonal matrix $D$ such that $A=BDB^{-1}$.
The trace of $A$ is
$$\mathop{\mathrm{tr}}(A)=\sum_i a_{ii}.$$
$Q\in\mathbb{C}^{n\times n}$ is unitary if $Q^*Q=QQ^*=I$, where $Q^*=(\bar Q)^T$.
The Schur decomposition of $A$ is $A=QTQ^*$, where $Q$ is unitary and $T$ is upper triangular.
$A$ and $B$ are similar if $B=QAQ^{-1}$ for some nonsingular matrix $Q$.
$A$ is normal if $AA^*=A^*A$.
There are many facts related to the eigenvalue problem for general matrices. We state some basic ones:
$\lambda\in\sigma(A) \Leftrightarrow p_A(\lambda)=0$.
Cayley-Hamilton Theorem. $p_A(A)=0$.
A simple eigenvalue is semisimple.
$\mathop{\mathrm{tr}}(A)=\sum_{i=1}^n \lambda_i$.
$\det(A)=\prod_{i=1}^n \lambda_i$.
$A$ is singular $\Leftrightarrow$ $\det(A)=0$ $\Leftrightarrow$ $0\in\sigma(A)$.
If $A$ is triangular, $\sigma(A)=\{a_{11},a_{22},\ldots,a_{nn}\}$.
For $A\in\mathbb{C}^{n\times n}$, $\lambda\in\sigma(A)$ $\Leftrightarrow$ $\bar\lambda\in\sigma(A^*)$.
Corollary of the Fundamental theorem of algebra. For $A\in\mathbb{R}^{n\times n}$, $\lambda\in\sigma(A)$ $\Leftrightarrow$ $\bar\lambda\in\sigma(A)$.
If $(\lambda,x)$ is an eigenpair of a nonsingular $A$, then $(1/\lambda,x)$ is an eigenpair of $A^{-1}$.
Eigenvectors corresponding to distinct eigenvalues are linearly independent.
$A$ is diagonalizable $\Leftrightarrow$ $A$ is nondefective $\Leftrightarrow$ $A$ has $n$ linearly independent eigenvectors.
Every matrix $A$ has a Schur decomposition. Moreover, $T_{ii}=\lambda_i$ for $i=1,\ldots,n$.
If $A$ is normal, the matrix $T$ from its Schur decomposition is normal. Consequently, $T$ is diagonal, so $A$ is unitarily diagonalizable and the columns of $Q$ are orthonormal eigenvectors of $A$.
If $A$ and $B$ are similar, $\sigma(A)=\sigma(B)$. Consequently, $\mathop{\mathrm{tr}}(A)=\mathop{\mathrm{tr}}(B)$ and $\det(A)=\det(B)$.
Eigenvalues and eigenvectors are continuous and differentiable: if $\lambda$ is a simple eigenvalue of $A$ and $A(\varepsilon)=A+\varepsilon E$ for some $E\in \mathbb{F}^{n\times n}$, then for small $\varepsilon$ there exist differentiable functions $\lambda(\varepsilon)$ and $x(\varepsilon)$ such that $A(\varepsilon)\,x(\varepsilon)=\lambda(\varepsilon)\,x(\varepsilon)$, with $\lambda(0)=\lambda$ and $x(0)=x$.
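As a quick numerical illustration of the last fact (a sketch: A₀ is the matrix analyzed below, while E and ε are ad-hoc choices), the simple eigenvalue $-6$ moves by roughly $\varepsilon$, while the defective double eigenvalue $15$ splits by roughly $\sqrt{\varepsilon}$, which is why simplicity matters:
In [ ]:
using LinearAlgebra
A₀=[-3 7 -1; 6 8 -2; 72 -28 19]
E=ones(3,3)
for ε in (1e-4, 1e-8)
    # eigenvalues of the perturbed matrix for decreasing ε
    println(eigvals(A₀+ε*E))
end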
In [1]:
using SymPy
In [2]:
A=[-3 7 -1; 6 8 -2; 72 -28 19]
Out[2]:
In [3]:
@vars x
Out[3]:
In [4]:
using LinearAlgebra
eye(n)=Matrix{Int}(I,n,n)
A-x*eye(3)
Out[4]:
In [5]:
# Characteristic polynomial p_A(λ)
p(x)=det(A-x*eye(3))
p(x)
Out[5]:
In [6]:
# Characteristic polynomial in nicer form
p(x)=factor(det(A-x*eye(3)))
p(x)
Out[6]:
In [7]:
λ=solve(p(x),x)
Out[7]:
The eigenvalues are $\lambda_1=-6$ and $\lambda_2=15$ with algebraic multiplicities $\alpha(\lambda_1)=1$ and $\alpha(\lambda_2)=2$.
In [8]:
g=nullspace(map(Float64,A-λ[1]*eye(3)))
Out[8]:
In [9]:
h=nullspace(map(Float64,A-λ[2]*eye(3)))
Out[9]:
The geometric multiplicities are $\gamma(\lambda_1)=1$ and $\gamma(\lambda_2)=1$. Thus, $\lambda_2$ is not semisimple; therefore, $A$ is defective and not diagonalizable.
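Equivalently, since $\gamma(\lambda)=n-\mathop{\mathrm{rank}}(A-\lambda I)$, the geometric multiplicity can be read off from the rank (a small sketch reusing eye() and λ from above):
In [ ]:
# rank(A-15I)=2, hence γ(λ₂)=3-2=1
3-rank(map(Float64,A-λ[2]*eye(3)))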
In [10]:
# Trace and determinant
tr(A), λ[1]+λ[2]+λ[2]
Out[10]:
In [11]:
det(A), λ[1]*λ[2]*λ[2]
Out[11]:
In [12]:
# Schur decomposition
T,Q=schur(A)
T
Out[12]:
In [13]:
Q
Out[13]:
In [14]:
# ?schur
In [15]:
F=schur(A)
fieldnames(typeof(F))
Out[15]:
In [16]:
F.Z
Out[16]:
In [17]:
println(diag(T))
In [18]:
Q'*Q
Out[18]:
In [19]:
Q*Q'
Out[19]:
In [20]:
# Similar matrices
M=rand(-5:5,3,3)
B=M*A*inv(M)
eigvals(B), tr(B), det(B)
Out[20]:
In [21]:
A=[57 -21 21; -14 22 -7; -140 70 -55]
Out[21]:
In [22]:
p(x)=factor(det(A-x*eye(3)))
p(x)
Out[22]:
In [23]:
λ=solve(p(x),x)
Out[23]:
In [24]:
h=nullspace(map(Float64,A-λ[2]*eye(3)))
Out[24]:
In [25]:
F=schur(A)
F.T
Out[25]:
In [26]:
using Random
Random.seed!(123)
A=rand(-4:4,4,4)
Out[26]:
In [27]:
p(x)=factor(det(A-x*eye(4)))
p(x)
Out[27]:
In [28]:
λ=solve(p(x),x)
Out[28]:
In [29]:
length(λ)
Out[29]:
Since all eigenvalues are distinct, they are all simple and the matrix is diagonalizable. With high probability, all eigenvalues of a random matrix are simple.
Do not try to use nullspace() here.
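Instead, numerical eigenvectors can be computed with eigen() from LinearAlgebra (a sketch; the symbolic roots above are exact, while these are floating-point):
In [ ]:
# numerical eigenvalues and eigenvectors of the current random A
λn,Un=eigen(map(Float64,A))
λn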
In [30]:
A=rand(4,4)
p(x)=factor(det(A-x*eye(4)))
p(x)
Out[30]:
In this case, symbolic computation does not work well with floating-point numbers - the degree of $p_A(x)$ is 8 instead of 4.
Let us try Rational numbers:
In [31]:
A=map(Rational,A)
Out[31]:
In [32]:
p(x)=factor(det(A-x*eye(4)))
p(x)
Out[32]:
In [33]:
λ=solve(p(x),x)
Out[33]:
In [34]:
length(λ)
Out[34]:
For more details, see A. Böttcher and I. Spitkovsky, Special Types of Matrices and the references therein.
Given $a_0,a_1,\ldots,a_{n-1}\in \mathbb{C}$, the circulant matrix is
$$C(a_0,a_1,\ldots,a_{n-1})=\begin{bmatrix} a_0 & a_{n-1} & a_{n-2} & \cdots & a_{1} \\ a_1 & a_0 & a_{n-1} & \cdots & a_{2} \\ a_2 & a_1 & a_{0} & \cdots & a_{3} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ a_{n-1} & a_{n-2} & a_{n-3} & \cdots & a_{0} \end{bmatrix}.$$
Let $a(z)=a_0+a_1z+a_2z^2+\cdots +a_{n-1}z^{n-1}$ be the associated complex polynomial.
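A circulant matrix is easy to build directly from its first column with a comprehension (a minimal sketch, no packages; mod handles the wrap-around indexing):
In [ ]:
a=[1,2,3,4]
n=length(a)
# entry (i,j) is a_{(i-j) mod n}, with a[1] playing the role of a_0
C=[a[mod(i-j,n)+1] for i=1:n, j=1:n]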
Let $\omega_n=\exp\big(\displaystyle\frac{2\pi i}{n}\big)$. The Fourier matrix of order $n$ is
$$F_n=\frac{1}{\sqrt{n}} \bigg[ \omega_n^{(j-1)(k-1)}\bigg]_{j,k=1}^n= \frac{1}{\sqrt{n}} \begin{bmatrix} 1 & 1 & 1 & \cdots & 1 \\ 1 & \omega_n & \omega_n^2 & \cdots & \omega_n^{n-1} \\ 1 & \omega_n^2 & \omega_n^4 & \cdots & \omega_n^{2(n-1)} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & \omega_n^{n-1} & \omega_n^{2(n-1)} & \cdots & \omega_n^{(n-1)(n-1)} \end{bmatrix}.$$
The Fourier matrix is unitary. The Fourier matrix is a Vandermonde matrix,
$$F_n=\frac{1}{\sqrt{n}} V(1,\omega_n,\omega_n^2,\ldots, \omega_n^{n-1}).$$
A circulant matrix is normal and, thus, unitarily diagonalizable, with the eigenvalue decomposition
$$C(a_0,a_1,\ldots,a_{n-1})=F_n^*\mathop{\mathrm{diag}}\big(a(1),a(\omega_n),a(\omega_n^2),\ldots, a(\omega_n^{n-1})\big)F_n.$$
We shall use the package SpecialMatrices.jl.
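Before turning to the package, here is a self-contained check of this decomposition on a small ad-hoc example (Fn is built from its definition; the norm should be at roundoff level):
In [ ]:
using LinearAlgebra
n=4
a=[1.0,2,3,4]
C=[a[mod(i-j,n)+1] for i=1:n, j=1:n]
ω=exp(2π*im/n)
Fn=[ω^((j-1)*(k-1)) for j=1:n, k=1:n]/sqrt(n)
# eigenvalues a(ω^k) for k=0,…,n-1
vals=[sum(a[l+1]*ω^(k*l) for l=0:n-1) for k=0:n-1]
norm(C-Fn'*Diagonal(vals)*Fn)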
In [35]:
# using Pkg; Pkg.add(PackageSpec(name="SpecialMatrices",rev="master"))
In [37]:
using SpecialMatrices
using Polynomials
In [38]:
varinfo(SpecialMatrices)
Out[38]:
In [39]:
import Random
Random.seed!(127)
n=6
a=rand(-9:9,n)
Out[39]:
In [40]:
C=Circulant(a)
Out[40]:
In [41]:
# Check for normality
C*C'-C'*C
Out[41]:
In [42]:
Cm=Matrix(C)
Out[42]:
In [43]:
Cm*Cm'-Cm'*Cm
Out[43]:
In [48]:
p1=Poly(a)
Out[48]:
In [49]:
# Polynomials are easy to manipulate
p2=p1*p1
Out[49]:
In [50]:
# n-th complex root of unity
ω=exp(2*π*im/n)
Out[50]:
In [51]:
v=[ω^k for k=0:n-1]
F=Vandermonde(v)
Out[51]:
In [52]:
Fn=Matrix(F)/sqrt(n)
Λ=Fn*C*Fn'
Out[52]:
In [54]:
# Check
[diag(Λ) p1.(v) eigvals(Matrix(C))]
Out[54]:
For more details and the proofs of the Facts below, see W. Barrett, Hermitian and Positive Definite Matrices and the references therein.
Matrix $A\in \mathbb{C}^{n\times n}$ is Hermitian or self-adjoint if $A^*=A$, or element-wise, $\bar a_{ij}=a_{ji}$. We say $A\in\mathcal{H}_n$.
Matrix $A\in \mathbb{R}^{n\times n}$ is symmetric if $A^T=A$, or element-wise, $a_{ij}=a_{ji}$. We say $A\in\mathcal{S}_n$.
The Rayleigh quotient of $A\in\mathcal{H}_n$ and a nonzero vector $x\in\mathbb{C}^n$ is
$$R_A(x)=\frac{x^*Ax}{x^*x}.$$
Matrices $A,B \in\mathcal{H}_n$ are congruent if there exists a nonsingular matrix $C$ such that $B=C^*AC$.
Inertia of $A\in\mathcal{H}_n$ is the ordered triple $$\mathop{\mathrm{in}}(A)=(\pi(A),\nu(A),\zeta(A)),$$
where $\pi(A)$ is the number of positive eigenvalues of $A$, $\nu(A)$ is the number of negative eigenvalues of $A$, and $\zeta(A)$ is the number of zero eigenvalues of $A$.
Gram matrix of a set of vectors $x_1,x_2,\ldots,x_k\in\mathbb{C}^{n}$ is the matrix $G$ with entries $G_{ij}=x_i^*x_j$.
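As a small sketch of the Rayleigh quotient (ad-hoc random Hermitian A; for an eigenvector, $R_A(x)$ reproduces the corresponding eigenvalue):
In [ ]:
using LinearAlgebra
A=rand(ComplexF64,4,4); A=A+A'
λ,U=eigen(A)
x=U[:,1]
# Rayleigh quotient of an eigenvector vs. its eigenvalue
(x'*A*x)/(x'*x), λ[1]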
Assume $A$ is Hermitian and $x\in\mathbb{C}^n$ is nonzero. Then:
Real symmetric matrix is Hermitian, and real Hermitian matrix is symmetric.
Hermitian and real symmetric matrices are normal.
$A+A^*$, $A^*A$, and $AA^*$ are Hermitian.
If $A$ is nonsingular, $A^{-1}$ is Hermitian.
Main diagonal entries of $A$ are real.
The matrix $T$ from the Schur decomposition of $A$ is Hermitian. Consequently, $T$ is real and diagonal, so all eigenvalues of $A$ are real and $A$ is unitarily diagonalizable.
Spectral Theorem. If $A\in\mathcal{H}_n$, there exist a unitary matrix $U$ and a real diagonal matrix $\Lambda$ such that $A=U\Lambda U^*$; the columns of $U$ are orthonormal eigenvectors of $A$ and the diagonal entries of $\Lambda$ are its eigenvalues. To summarize: if $A\in\mathcal{H}_n$ with eigenpairs $(\lambda_i,u_i)$, then
$$A=\lambda_1u_1u_1^*+\lambda_2 u_2u_2^* +\cdots +\lambda_n u_nu_n^*.$$
Similarly, if $A\in\mathcal{S}_n$, then $U$ can be chosen real and orthogonal, and
$$A=\lambda_1u_1u_1^T+\lambda_2 u_2u_2^T +\cdots +\lambda_n u_nu_n^T.$$
Weyl's Inequalities. For $A,B\in\mathcal{H}_n$, with the eigenvalues of $A$, $B$, and $A+B$ ordered decreasingly,
$$\lambda_{j+k-1}(A+B)\leq \lambda_j(A)+\lambda_k(B) \quad \textrm{for} \ j+k\leq n+1,$$
$$\lambda_{j+k-n}(A+B)\geq \lambda_j(A)+\lambda_k(B) \quad \textrm{for} \ j+k\geq n+1,$$
and, in particular,
$$\lambda_j(A)+\lambda_n(B) \leq \lambda_j(A+B) \leq \lambda_j(A)+\lambda_1(B),\quad \textrm{for} \ j=1,2,\ldots,n.$$
Cauchy Interlace Theorem. If $B$ is the principal submatrix of $A\in\mathcal{H}_n$ obtained by deleting one row and the corresponding column, then $\lambda_{i+1}(A)\leq \lambda_i(B)\leq \lambda_i(A)$ for $i=1,\ldots,n-1$.
$\pi(A)+\nu(A)+\zeta(A)=n$.
$\mathop{\mathrm{rank}}(A)=\pi(A)+\nu(A)$.
If $A$ is nonsingular, $\mathop{\mathrm{in}}(A)=\mathop{\mathrm{in}}(A^{-1})$.
If $A,B \in\mathcal{H}_n$ are similar, $\mathop{\mathrm{in}}(A)=\mathop{\mathrm{in}}(B)$.
Sylvester's Law of Inertia. $A,B \in\mathcal{H}_n$ are congruent if and only if $\mathop{\mathrm{in}}(A)=\mathop{\mathrm{in}}(B)$.
Subadditivity of Inertia. For $A,B \in\mathcal{H}_n$, $\pi(A+B)\leq \pi(A)+\pi(B)$ and $\nu(A+B)\leq \nu(A)+\nu(B)$.
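A quick check of subadditivity (a sketch with ad-hoc random symmetric matrices and an inertia() helper like the one defined in the cells below):
In [ ]:
using LinearAlgebra
inertia(A)=[sum(eigvals(A).>0), sum(eigvals(A).<0), sum(eigvals(A).==0)]
A=rand(-4:4,5,5); A=A+A'
B=rand(-4:4,5,5); B=B+B'
# π(A+B) ≤ π(A)+π(B) and ν(A+B) ≤ ν(A)+ν(B)
inertia(A+B), inertia(A), inertia(B)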
In [55]:
# Generating Hermitian matrix
Random.seed!(432)
n=4
A=rand(ComplexF64,n,n)
A=A+adjoint(A)
Out[55]:
In [56]:
ishermitian(A)
Out[56]:
In [57]:
# Diagonal entries
diag(A)
Out[57]:
In [58]:
# Schur decomposition
F=schur(A)
Out[58]:
In [59]:
F.T
Out[59]:
In [60]:
λ,U=eigen(A)
Out[60]:
In [61]:
λ
Out[61]:
In [62]:
# Spectral theorem
norm(A-U*Diagonal(λ)*U')
Out[62]:
In [63]:
# Spectral theorem
A-sum([λ[i]*U[:,i]*U[:,i]' for i=1:n])
Out[63]:
In [64]:
# Cauchy Interlace Theorem (repeat several times)
@show i=rand(1:n)
μ=eigvals(A[[1:i-1;i+1:n],[1:i-1;i+1:n]])
Out[64]:
In [65]:
λ
Out[65]:
In [66]:
# Inertia
inertia(A)=[sum(eigvals(A).>0), sum(eigvals(A).<0), sum(eigvals(A).==0)]
Out[66]:
In [67]:
inertia(A)
Out[67]:
In [68]:
# Similar matrices
C=rand(ComplexF64,n,n)
B=C*A*inv(C)
eigvals(B)
Out[68]:
In [69]:
B
Out[69]:
This did not work numerically due to rounding errors: the computed eigenvalues of $B$ have small nonzero imaginary parts.
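Up to roundoff, though, the real parts of the computed eigenvalues of B do match the eigenvalues of A (a quick check):
In [ ]:
# eigvals(A) is real and sorted for Hermitian A
norm(sort(real(eigvals(B)))-eigvals(A))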
In [70]:
# Congruent matrices - this does not work either, without some preparation
B=C'*A*C
inertia(A)==inertia(B)
In [71]:
# However,
inertia(A)==inertia(Hermitian(B))
Out[71]:
In [72]:
# or,
inertia((B+B')/2)
Out[72]:
In [73]:
# Weyl's Inequalities
B=rand(ComplexF64,n,n)
B=(B+B')/10
@show λ
μ=eigvals(B)
γ=eigvals(A+B)
μ,γ
Out[73]:
In [74]:
# Theorem uses different order!
j=rand(1:n)
k=rand(1:n)
@show j,k
if j+k<=n+1
@show sort(γ,rev=true)[j+k-1], sort(λ,rev=true)[j]+sort(μ,rev=true)[k]
end
if j+k>=n+1
sort(γ,rev=true)[j+k-n], sort(λ,rev=true)[j]+sort(μ,rev=true)[k]
end
In [75]:
sort(λ,rev=true)
Out[75]:
In [76]:
# Generating real symmetric matrix
Random.seed!(531)
n=6
A=rand(-9:9,n,n)
A=A+A'
Out[76]:
In [77]:
issymmetric(A)
Out[77]:
In [78]:
T,Q=schur(A)
Out[78]:
In [79]:
T
Out[79]:
In [80]:
Q
Out[80]:
In [81]:
λ,U=eigen(A)
λ
Out[81]:
In [82]:
cond(A)
Out[82]:
In [83]:
U
Out[83]:
In [84]:
A-sum([λ[i]*U[:,i]*U[:,i]' for i=1:n])
Out[84]:
In [85]:
inertia(A)
Out[85]:
In [86]:
C=rand(n,n)
inertia(C'*A*C)
Out[86]:
These matrices form an important subset of Hermitian and real symmetric matrices.
Matrix $A\in\mathcal{H}_n$ is positive definite (PD) if $x^*Ax>0$ for all nonzero $x\in\mathbb{C}^n$.
Matrix $A\in\mathcal{H}_n$ is positive semidefinite (PSD) if $x^*Ax\geq 0$ for all $x\in\mathbb{C}^n$.
$A\in\mathcal{S}_n$ is PD if $x^TAx>0$ for all nonzero $x\in \mathbb{R}^n$, and is PSD if $x^TAx\geq 0$ for all $x\in \mathbb{R}^n$.
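A direct check of the definition on a few random vectors (a sketch with an ad-hoc real symmetric PD matrix):
In [ ]:
using LinearAlgebra
A=[2.0 1 0; 1 2 1; 0 1 2]
# the quadratic form should be positive for every (nonzero) random x
all(x'*A*x>0 for x in (rand(3).-0.5 for _=1:5))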
If $A,B\in \mathrm{PSD}_n$, then $A+B\in \mathrm{PSD}_n$. If, in addition, $A\in \mathrm{PD}_n$, then $A+B\in \mathrm{PD}_n$.
If $A\in \mathrm{PD}_n$, then $\mathop{\mathrm{tr}}(A)>0$ and $\det(A)>0$.
If $A\in \mathrm{PSD}_n$, then $\mathop{\mathrm{tr}}(A)\geq 0$ and $\det(A)\geq 0$.
Any principal submatrix of a PD matrix is PD. Any principal submatrix of a PSD matrix is PSD. Consequently, all principal minors are positive or nonnegative, respectively.
$A\in\mathcal{H}_n$ is PD iff every leading principal minor of $A$ is positive. $A\in\mathcal{H}_n$ is PSD iff every principal minor is nonnegative.
For $A\in \mathrm{PSD}_n$ with eigenvalue decomposition $A=U\Lambda U^*$, there exists a unique PSD $k$-th root, $A^{1/k}=U\Lambda^{1/k} U^*$.
Cholesky Factorization. $A\in\mathcal{H}_n$ is PD iff there is an invertible lower triangular matrix $L$ with positive diagonal entries such that $A=LL^*$.
A Gram matrix is PSD. If the vectors are linearly independent, the Gram matrix is PD.
In [87]:
# Generating positive definite matrix as a Gram matrix
n=5
A=rand(n,n)+im*rand(n,n)
A=A*A'
Out[87]:
In [88]:
ishermitian(A)
Out[88]:
In [89]:
eigvals(A)
Out[89]:
In [90]:
# Positivity of principal leading minors
[det(A[1:i,1:i]) for i=1:n]
Out[90]:
In [91]:
# Square root of A via its eigenvalue decomposition
λ,U=eigen(A)
Ar=U*Diagonal(sqrt.(λ))*U'
norm(A-Ar*Ar)
Out[91]:
In [92]:
# Cholesky factorization - the upper triangular factor is returned
L=cholesky(A).U
Out[92]:
In [93]:
norm(A-L'*L)
Out[93]:
In [94]:
# Generating positive semidefinite matrix as a Gram matrix, try it several times
n=6
m=4
A=rand(-9:9,n,m)
Out[94]:
In [95]:
A=A*A'
Out[95]:
In [96]:
# There are rounding errors!
eigvals(A)
Out[96]:
In [97]:
s=svdvals(A)
Out[97]:
In [98]:
s[1]*eps()
Out[98]:
In [99]:
rank(A)
Out[99]:
In [100]:
# Cholesky factorization - this can fail for a singular PSD matrix (PosDefException)
L=cholesky(A).U
Covariance and correlation matrices are PSD.
A correlation matrix is a diagonally scaled covariance matrix.
In [101]:
Random.seed!(651)
x=rand(10,5)
Out[101]:
In [102]:
using Statistics
A=cov(x)
Out[102]:
In [103]:
# Covariance matrix is a Gram matrix
(x.-mean(x,dims=1))'*(x.-mean(x,dims=1))/(size(x,1)-1)-A
Out[103]:
In [104]:
B=cor(x)
Out[104]:
In [105]:
# Diagonal scaling
D=1 ./sqrt.(diag(A))
Out[105]:
In [106]:
Diagonal(D)*A*Diagonal(D)
Out[106]:
In [107]:
eigvals(A)
Out[107]:
In [108]:
eigvals(B)
Out[108]:
In [109]:
C=cov(x')
Out[109]:
In [110]:
eigvals(C)
Out[110]:
In [111]:
inertia(C)
Out[111]:
In [112]:
rank(C)
Out[112]:
Explain the function rank().
In [113]:
@which rank(C)
Out[113]:
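rank() computes the singular values of the matrix and counts those exceeding a tolerance relative to the largest one. A sketch of the idea (my_rank and its default rtol are illustrative assumptions; the exact rule is in the method displayed by @which above):
In [ ]:
# illustrative reimplementation; the true default tolerance is version-dependent
my_rank(A; rtol=minimum(size(A))*eps(Float64)) = count(>(rtol*first(svdvals(A))), svdvals(A))
my_rank(C), rank(C)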